EPJ manuscript No. 

(will be inserted by the editor) 



Geometric study for the Legendre duality of generalized entropies 
and its application to the porous medium equation 

Atsumi Ohara 

Department of Systems Science, Osaka University, 1-3 Machikane-yama, Toyonaka, Osaka, 560-8531 JAPAN 
e-mail: ohara@sys.es.osaka-u.ac.jp 

Received: / Revised version: 

Abstract. We geometrically study the Legendre duality relation that plays an important role in statistical 
physics with the standard or generalized entropies. For this purpose, we introduce dualistic structure 
defined by information geometry, and discuss concepts arising in generalized thermostatistics, such as 
relative entropies, escort distributions and modified expectations. Further, a possible generalization of these 
concepts in a certain direction is also considered. Finally, as an application of such a geometric viewpoint, 
we briefly demonstrate several new results on a behavior of the solution to the nonlinear diffusion equation 
called the porous medium equation. 

PACS. 89.70.Cf Entropy and other measures of information - 02.40.Hw Classical differential geometry - 
05.90.+m Other topics in statistical physics, thermodynamics, and nonlinear dynamical systems 



1 Introduction 

In recent decades, the study of physical or artificial systems not obeying the usual Boltzmann-Gibbs 
statistical mechanics has received increasing attention. For examples of such systems see [1,2,3,4] 
and the references therein. Their common nature would be that the Boltzmann distribution does not 
correspond to their equilibria. One of the main research directions to overcome the difficulty of 
analysis for such systems is generalizing the notion of entropies within the framework of statistical 
physics. 

In this generalization, the Legendre duality relation is still of fundamental importance and is 
required to prescribe a link between intensive and extensive parameters. In statistics this nice 
structure has been well exploited via information geometry [5,6] for the standard exponential family, 
and the results have been successfully applied mainly to statistical estimation, information theory, 
learning theory and so on. 

The purpose of this paper is to study the Legendre duality relation of generalized entropies from 
information geometric viewpoints and to provide new insights and tools for this field by showing 
their usefulness through several applications. 

In the aspect of geometric structure with Legendre duality, there may be at least two major 
directions to generalize the notion of entropies from the standard one. In section 2 we characterize 
the difference between these two methods in terms of a pair of representing functions for 
distributions. One method always fixes one of the representing functions to the identity map, while 
the other varies both. The former leads to the geometry induced by the Bregman divergence. The latter 
includes the $\alpha$-geometry as a special case. 

From section 3 to 7 we discuss the Tsallis statistics, its generalization and applications. Section 3 
presents a short review of the relation of the $\alpha$-geometry with the Tsallis statistics and 
emphasizes its importance. In section 4 we reconsider the construction of the $\alpha$-geometry by 
the affine surface theory [7,8,9] as preparation for a more generalized setup. The geometrical 
relation between the two manifolds of the ordinary and the escort distributions is discussed. 
Section 5 proposes a generalization of the $\alpha$-geometric structure and the associated 
divergences using a certain class of convex functions. It is seen that the centro-affine immersion 
[9] is essential to conserve the dualistic structure. In sections 6 and 7 we demonstrate applications 
of the introduced geometric notions to exploit the properties of generalized entropies. Section 6 
shows the relation between modified averages (expectations) and convexities, which plays an important 
role in minimizing relative entropies under the average constraints. In section 7 we prove that the 
trajectory of the gradient flow for the $\alpha$-divergence is a geodesic curve and that it possesses 
several constants of motion. 

Finally, in sections 8 and 9 we introduce the so-called Bregman divergence [10], the associated 
generalized entropy and the geometry behind these quantities studied in [11,12,13,14]. The feature is 
that the linear averages of the extensive physical quantities naturally appear in this setup. As an 
application, we show several new results on the behavior of the solutions to the porous medium 
equation (PME). The behavior is characterized in terms of geometric concepts induced on a generalized 
exponential family called the q-Gaussian densities. This family can be proved to be an invariant 
manifold for the PME. Further, we show that the trajectory of the solution on the manifold coincides 
with a geodesic curve with respect to one of the dual affine connections. In addition, the 
convergence rate to the manifold is evaluated. For this part, a full description with complete proofs 
can be found in [16]. 



2 Statistical model and dualistic structure 

Let $p_\zeta(x) = p(x;\zeta)$ be a probability distribution for a random variable $x$ (or a density 
function for a continuous random variable) parametrized by a finite-dimensional parameter vector 
$\zeta = (\zeta^1, \cdots, \zeta^n) \in Z$, where $Z$ is a certain domain in $\mathbf{R}^n$. We call 
the set of $p_\zeta$ the statistical model and denote it by $\mathcal{M}$. The concrete examples in 
this paper are the probability simplex (section 3) and the q-Gaussian densities (section 9). We 
usually assume that $\mathcal{M}$ satisfies several regularity conditions such as smoothness of the 
map $\zeta \mapsto p_\zeta$, commutativity of integrations and differentiations and so on. See [5,6] 
for details. 

It is known that the standard Boltzmann-Gibbs-Shannon (BGS) entropy is maximized on the Boltzmann 
distribution (exponential family) under the expectation constraint of the Hamiltonian. Similarly, 
each generalized entropy is maximized on the corresponding statistical model under the constraint. 
See, for example, Remark 1 in section 8. This is the major reason that motivates us to study the 
structure of specific statistical models focusing on the Legendre duality of generalized entropies. 

Information geometry is a convenient framework for this purpose. In order to introduce the geometric 
structure on the statistical model, we define the following quantities: 

$$ g_{ij}(\zeta) := \int \partial_i L(p_\zeta)\, \partial_j L^*(p_\zeta)\, dx, $$ 
$$ \Gamma_{ij,k}(\zeta) := \int \partial_i \partial_j L(p_\zeta)\, \partial_k L^*(p_\zeta)\, dx, $$ 
$$ \Gamma^*_{ij,k}(\zeta) := \int \partial_k L(p_\zeta)\, \partial_i \partial_j L^*(p_\zeta)\, dx, $$ 

where $\partial_i := \partial/\partial\zeta^i$. Here, a pair of one-to-one and smooth functions 
$L(u)$ and $L^*(u)$ on $u > 0$ are called representing functions for the distribution $p_\zeta$, 
which determine the Legendre duality such as pairs of dual coordinate systems (physically, extensive 
and intensive parameters) or potential functions conjugate to each other. We use $g := (g_{ij})$ as a 
Riemannian metric on $\mathcal{M}$ and 

$$ \Gamma^k_{ij} := \sum_{l=1}^{n} g^{kl}\,\Gamma_{ij,l}, \qquad \Gamma^{*k}_{ij} := \sum_{l=1}^{n} g^{kl}\,\Gamma^*_{ij,l} $$ 

as the components for two affine connections $\nabla$ and $\nabla^*$ on $\mathcal{M}$, where 
$g^{ij}$ is the component of the inverse matrix of $g$. Then the above definitions imply that the 
following duality relation of the connections [5,6] holds: 

$$ \partial_i g_{jk} = \Gamma_{ij,k} + \Gamma^*_{ik,j}, \qquad (1) $$ 

which is equivalent to (54) in Appendix A. This relation is important for the geometric study of the 
Legendre duality. 

The typical cases are classified as follows: 

i) The standard information geometry corresponding to the BGS entropy and the Kullback-Leibler 
relative entropy is derived by 

$$ L(u) = u, \qquad L^*(u) = \ln u. $$ 

ii) The $\alpha$-geometry corresponding to the Havrda-Charvat-Tsallis entropy (4) and the Tsallis 
relative entropy (6) utilizes 

$$ L(u) = L^{(\alpha)}(u) := \frac{2}{1-\alpha}\, u^{(1-\alpha)/2}, \qquad L^*(u) = L^{(-\alpha)}(u). $$ 

This class, its generalization and applications are discussed from section 3 to 7. 

iii) Information geometry called the U-geometry corresponds to the Bregman-type divergences 
[11,12,13,14,15] and the associated generalized entropies. It is reproduced from 

$$ L(u) = u, \qquad L^*(u) = \ln_\phi(u), $$ 

where $\ln_\phi$ is a generalized logarithmic function defined by (32) (e.g., [13,14,15]). This class 
and its applications are discussed in sections 8 and 9. 

Because $L(u) = u$ in the cases i) and iii), the linear average naturally appears and plays an 
important role. In these cases, the corresponding connections $\nabla$ and $\nabla^*$ are called 
(generalized) mixture and exponential connections, respectively. On the other hand, in the cases i) 
and ii) the obtained Riemannian metric $g$ coincides with the Fisher information, i.e., 

$$ g_{ij}(\zeta) = \int p_\zeta\, \partial_i \ln p_\zeta\, \partial_j \ln p_\zeta\, dx. \qquad (2) $$ 

This is a very important point in applying the geometry to statistical inference. We see from (2) 
that the relation 

$$ \frac{dL}{du}\frac{dL^*}{du} = \frac{1}{u} $$ 

should be satisfied for $g$ to be the Fisher information matrix. Hence, the Riemannian metric in the 
case iii) is not generally the Fisher metric. However, it is physically interpreted as a susceptance 
matrix via the linear average (see section 8). 
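
The condition $(dL/du)(dL^*/du) = 1/u$ of this section can be checked numerically on a finite sample space, where the integral becomes a sum over the $n+1$ states and the parameters are the first $n$ probabilities. The following is only a minimal sketch under these assumptions; the function names are hypothetical and not taken from the paper.

```python
import numpy as np

def rep_derivative(u, alpha):
    """Derivative of L^(alpha)(u) = 2/(1-alpha) * u**((1-alpha)/2)."""
    return u ** (-(1.0 + alpha) / 2.0)

def metric_from_representations(p, alpha):
    """g_ij = sum_k d_i L(p_k) d_j L*(p_k) with L = L^(alpha), L* = L^(-alpha).

    p has n+1 entries; the coordinates are p_1..p_n and p_{n+1} = 1 - sum.
    """
    n = len(p) - 1
    # jac[k, i] = d p_k / d p_i for the n free coordinates
    jac = np.vstack([np.eye(n), -np.ones((1, n))])
    # product of the two derivatives; equals 1/p for every alpha
    weight = rep_derivative(p, alpha) * rep_derivative(p, -alpha)
    return jac.T @ (weight[:, None] * jac)

def fisher_metric(p):
    """Fisher information matrix of the simplex in the coordinates p_1..p_n."""
    n = len(p) - 1
    return np.diag(1.0 / p[:n]) + 1.0 / p[n]

p = np.array([0.1, 0.2, 0.3, 0.4])
g = metric_from_representations(p, alpha=0.3)
```

For any admissible $\alpha$ the matrix `g` agrees with `fisher_metric(p)`, which illustrates why case ii) shares the Fisher metric with case i).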



3 Review of Tsallis entropy via 
alpha-geometry 

Let $\mathcal{S}^n$ denote the n-dimensional probability simplex, i.e., 

$$ \mathcal{S}^n := \Big\{ p = (p_i) \ \Big|\ p_i > 0,\ \sum_{i=1}^{n+1} p_i = 1 \Big\}, \qquad (3) $$ 

and $p_i$, $i = 1, \cdots, n+1$ denote probabilities of $n+1$ states. The set $\mathcal{S}^n$ is an 
example of the statistical model with parameters $p_i$, $i = 1, \cdots, n$. The function $S_q$ 
defined on $\bar{\mathcal{S}}^n$, the closure of $\mathcal{S}^n$, for a real parameter $q$ 
($q \neq 0, 1$) by 

$$ S_q(p) := -k \sum_{i=1}^{n+1} (p_i)^q \ln_q p_i \qquad (4) $$ 



is called the Havrda-Charvat-Tsallis (HCT) entropy [17,18], where $\ln_q$ is the q-logarithmic 
function defined by $\ln_q x = (x^{1-q} - 1)/(1-q)$. Note that the HCT entropy converges to the BGS 
entropy when $q$ goes to one. Hereafter, the positive constant $k$ is set to one for the sake of 
simplicity. The HCT entropy is concave if $q > 0$ and convex if $q < 0$. It does not satisfy 
additivity, i.e., it holds that 

$$ S_q(p \otimes r) = S_q(p) + S_q(r) + (1-q)\, S_q(p)\, S_q(r) \qquad (5) $$ 

for $p = (p_i) \in \bar{\mathcal{S}}^n$, $r = (r_j) \in \bar{\mathcal{S}}^m$ and 
$p \otimes r := (p_i r_j) \in \bar{\mathcal{S}}^{nm+n+m}$. The relation (5) is called nonextensivity. See, 
for details, a recent review paper ^ . 
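
The nonextensivity relation (5) is easy to confirm numerically. The sketch below assumes $k = 1$ and uses hypothetical helper names; it builds the product distribution $p \otimes r$ and compares both sides of (5).

```python
import numpy as np

def ln_q(x, q):
    """q-logarithm ln_q x = (x^(1-q) - 1) / (1 - q), for q != 1."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def hct_entropy(p, q):
    """HCT entropy S_q(p) = -sum_i p_i^q ln_q p_i with k = 1."""
    return -np.sum(p ** q * ln_q(p, q))

p = np.array([0.2, 0.3, 0.5])
r = np.array([0.6, 0.4])
q = 0.7

joint = np.outer(p, r).ravel()          # product distribution (p_i r_j)
lhs = hct_entropy(joint, q)
rhs = (hct_entropy(p, q) + hct_entropy(r, q)
       + (1.0 - q) * hct_entropy(p, q) * hct_entropy(r, q))

# q close to one should reproduce the BGS (Shannon) entropy
near_bgs = hct_entropy(p, 1.0 + 1e-6)
bgs = -np.sum(p * np.log(p))
```

The cross term $(1-q) S_q(p) S_q(r)$ vanishes as $q \to 1$, which recovers the usual additivity of the BGS entropy.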

For several reasons the following quantity $K_q$ is introduced in [19,20,21,22] as a relative entropy 
between two probability distributions $p$ and $r$ in $\mathcal{S}^n$, which is of the form: 

$$ K_q(p,r) := -\sum_{i=1}^{n+1} p_i \ln_q \frac{r_i}{p_i} = \frac{1}{1-q}\Big( 1 - \sum_{i=1}^{n+1} (p_i)^q (r_i)^{1-q} \Big). \qquad (6) $$ 

When $q$ is positive, $K_q(p,r) \ge 0$ and the equality holds if and only if $p = r$. Note that $K_q$ 
converges to the Kullback-Leibler divergence as $q$ goes to one. For the uniform distribution 
$u = (u_i)$ with $u_i = 1/(n+1)$ for all $i = 1, \cdots, n+1$, it holds that 

$$ K_q(p,u) = \Big( \frac{1}{n+1} \Big)^{1-q} \{ S_q(u) - S_q(p) \}. $$ 

Hence, if $q > 0$, maximizing the HCT entropy $S_q(p)$ is equivalent to minimizing $K_q(p,u)$. 

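On the other hand, the quantity called the $\alpha$-divergence [5,6] $D^{(\alpha)}$ has been used in 
mathematical statistics, which is defined with a real parameter $\alpha$ ($\alpha \neq \pm 1$) by 

$$ D^{(\alpha)}(p,r) := \frac{4}{1-\alpha^2} \Big\{ 1 - \sum_{i=1}^{n+1} (p_i)^{(1-\alpha)/2} (r_i)^{(1+\alpha)/2} \Big\} \qquad (7) $$ 

for two probabilities $p$ and $r$ in $\mathcal{S}^n$. By equating $q = (1-\alpha)/2$, we see that the 
Tsallis relative entropy $K_q$ coincides with the $\alpha$-divergence on $\mathcal{S}^n$ up to a 
constant factor, i.e., 

$$ D^{(\alpha)}(p,r) = \frac{1}{q} K_q(p,r). \qquad (8) $$ 

Note that $D^{(\alpha)}$ is nonnegative regardless of $\alpha$ and positive if and only if 
$p \neq r$. It also converges to the Kullback-Leibler divergence when $\alpha \to -1$, and it is 
convex with respect to $p$ and $r$ if $-1 < \alpha < 1$. 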
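
Both the identity $D^{(\alpha)} = K_q/q$ with $q = (1-\alpha)/2$ and the expression of $K_q(p,u)$ through entropy differences can be verified with a few lines of code. This is a minimal sketch with hypothetical function names, assuming $k = 1$ and strictly positive probabilities.

```python
import numpy as np

def ln_q(x, q):
    """q-logarithm ln_q x = (x^(1-q) - 1) / (1 - q), for q != 1."""
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def hct_entropy(p, q):
    """HCT entropy S_q(p) with k = 1."""
    return -np.sum(p ** q * ln_q(p, q))

def tsallis_relative_entropy(p, r, q):
    """K_q(p, r) = -sum_i p_i ln_q(r_i / p_i)."""
    return -np.sum(p * ln_q(r / p, q))

def alpha_divergence(p, r, alpha):
    """D^(alpha)(p, r) = 4/(1-alpha^2) * (1 - sum_i p_i^((1-alpha)/2) r_i^((1+alpha)/2))."""
    overlap = np.sum(p ** ((1.0 - alpha) / 2.0) * r ** ((1.0 + alpha) / 2.0))
    return 4.0 / (1.0 - alpha ** 2) * (1.0 - overlap)

q = 0.7
alpha = 1.0 - 2.0 * q                   # from q = (1 - alpha) / 2
p = np.array([0.1, 0.2, 0.3, 0.4])
r = np.array([0.4, 0.3, 0.2, 0.1])
u = np.full(4, 0.25)                    # uniform distribution, n + 1 = 4

ratio_check = alpha_divergence(p, r, alpha) - tsallis_relative_entropy(p, r, q) / q
uniform_check = (tsallis_relative_entropy(p, u, q)
                 - 4.0 ** (q - 1.0) * (hct_entropy(u, q) - hct_entropy(p, q)))
```

Both `ratio_check` and `uniform_check` vanish up to rounding error.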

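It is known that the $\alpha$-divergence induces a differential geometric structure on 
$\mathcal{S}^n$ with a Riemannian metric and an affine connection denoted by $g$ and 
$\nabla^{(\alpha)}$, respectively. We call it the $\alpha$-geometry [5,6]. While the way to induce 
them from the $\alpha$-divergence is omitted here, the resultant componentwise expressions of $g$ and 
$\nabla^{(\alpha)}$ are as follows: Let $\partial_i$ be a natural basis tangent vector field on 
$\mathcal{S}^n$ defined by 

$$ \partial_i := \frac{\partial}{\partial p_i} - \frac{\partial}{\partial p_{n+1}}, \qquad i = 1, \cdots, n. \qquad (9) $$ 

Then, the induced Riemannian metric is nothing but the Fisher information matrix, i.e., 

$$ g_{ij}(p) := g(\partial_i, \partial_j) = \frac{1}{p_i}\delta_{ij} + \frac{1}{p_{n+1}}. \qquad (10) $$ 

The induced affine connection $\nabla^{(\alpha)}$ is called the $\alpha$-connection, which is 
represented in its coefficients by 

$$ \Gamma^{(\alpha)k}_{ij} = \frac{1+\alpha}{2}\Big( -\frac{1}{p_k}\delta^k_{ij} + p_k g_{ij} \Big), \qquad i,j,k = 1, \cdots, n, \qquad (11) $$ 

where $\delta^k_{ij}$ is equal to one if $i = j = k$ and zero otherwise. Then we have its covariant 
derivatives by 

$$ \nabla^{(\alpha)}_{\partial_i} \partial_j = \sum_{k=1}^{n} \Gamma^{(\alpha)k}_{ij}\, \partial_k. $$ 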

There are two specific features of the $\alpha$-geometry on $\mathcal{S}^n$ induced in such a way. 
First, the triple $(\mathcal{S}^n, g, \nabla^{(\alpha)})$ is a statistical manifold (see Appendix for 
its definition), i.e., we can confirm that the following holds: 

$$ X g(Y,Z) = g(\nabla^{(\alpha)}_X Y, Z) + g(Y, \nabla^{(-\alpha)}_X Z) \qquad (12) $$ 

for arbitrary vector fields $X$, $Y$ and $Z$ on $\mathcal{S}^n$. Thus, $\nabla^{(\alpha)}$ and 
$\nabla^{(-\alpha)}$ are mutually dual. The relation (12) is closely related to the Legendre duality. 

Another feature is that $(\mathcal{S}^n, g, \nabla^{(\alpha)})$ is a manifold with constant curvature 
$\kappa = (1-\alpha^2)/4 = q(1-q)$, i.e., it holds that 

$$ R^{(\alpha)}(X,Y)Z = \kappa \{ g(Y,Z)X - g(X,Z)Y \}, $$ 

where $R^{(\alpha)}$ is the Riemann-Christoffel curvature with respect to $\nabla^{(\alpha)}$. 
Because of this property the $\alpha$-divergence meets the modified Pythagorean relation for $p$, $q$ 
and $r$ that form a "right triangle" on $\mathcal{S}^n$ with respect to $g$ and 
$\nabla^{(\pm\alpha)}$, i.e., 

Proposition 1 Let $\gamma^{(\alpha)}$ and $\gamma^{(-\alpha)}$ be respectively the 
$\nabla^{(\alpha)}$-geodesic joining $p$ and $q$, and the $\nabla^{(-\alpha)}$-geodesic joining $q$ 
and $r$. If $\gamma^{(\alpha)}$ and $\gamma^{(-\alpha)}$ are orthogonal at $q$ with respect to $g$, 
then it holds that 

$$ D^{(\alpha)}(p,r) = D^{(\alpha)}(p,q) + D^{(\alpha)}(q,r) - \kappa\, D^{(\alpha)}(p,q)\, D^{(\alpha)}(q,r). \qquad (13) $$ 

This relation is quite important in studying the properties of the HCT entropy $S_q$ because its 
nonextensivity (5) is straightforwardly derived from (13). It means that nonflatness 
($\kappa \neq 0$) of the manifold $\mathcal{S}^n$ is geometrically interpreted as a direct cause of 
the nonextensivity [23]. Further, (13) ensures the uniqueness of the equilibrium distribution 
minimizing the Tsallis relative entropy $K_q$ with constraints given in terms of the normalized 
q-expectation [23] (see also the discussion in section 6). 



4 Escort distribution from a viewpoint of 
affine immersion 

In the previous section the $\alpha$-geometry is introduced from the $\alpha$-divergence. Another way 
to construct the $\alpha$-geometry on $\mathcal{S}^n$ is using the affine immersion [9] of 
$\mathcal{S}^n$ into $\mathbf{R}^{n+1}$ equipped with the standard flat connection $D$. The advantage 
of this method is that the escort probability naturally appears along with the setup and its 
geometrical meaning is elucidated. Hereafter, we assume that $0 < q = (1-\alpha)/2 < 1$ to simplify 
the discussion. Several concepts of affine immersion are summarized in the appendix. For details, see 
the references. 

Let $\theta = (\theta^i)$, $i = 1, \cdots, n+1$ be the standard coordinate system of the vector space 
$\mathbf{R}^{n+1}$ with respect to $\{o; e_1, \cdots, e_{n+1}\}$, the origin as zero vector and the 
natural basis vectors. Denote by $\mathbf{R}^{n+1}_+$ the positive orthant of $\mathbf{R}^{n+1}$. 

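Consider the immersion $f$ of $\mathcal{S}^n$ into $\mathbf{R}^{n+1}_+$ by 

$$ f : p = (p_i) \mapsto \theta = (\theta^i) = (L^{(\alpha)}(p_i)), \qquad i = 1, \cdots, n+1, \qquad (14) $$ 

where the representing function $L^{(\alpha)}$ is defined by 

$$ L^{(\alpha)}(t) := \frac{2}{1-\alpha}\, t^{(1-\alpha)/2} = \frac{1}{q}\, t^q. \qquad (15) $$ 

Note that $f(\mathcal{S}^n)$ is a level surface $\psi(\theta) = 2/(1+\alpha)$ in 
$\mathbf{R}^{n+1}_+$ of the function defined by 

$$ \psi(\theta) := \frac{2}{1+\alpha} \sum_{i=1}^{n+1} \Big( \frac{1-\alpha}{2}\theta^i \Big)^{2/(1-\alpha)} = \frac{1}{1-q} \sum_{i=1}^{n+1} (q\theta^i)^{1/q}. \qquad (16) $$ 

By the assumption for the range of $q = (1-\alpha)/2$, the function $\psi$ is convex with the 
positive definite Hessian matrix on $\mathbf{R}^{n+1}_+$. 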

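At this stage, we still have a freedom of choosing the transversal vector $\xi$. The freedom induces 
realizations of different geometric structures $(\mathcal{S}^n, h, \nabla)$. Here we take $\xi$ as 

$$ \xi := \sum_{i=1}^{n+1} \xi^i \frac{\partial}{\partial\theta^i}, \qquad \xi^i = -q(1-q)\,\theta^i. \qquad (17) $$ 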

This choice of is derived by 



1 



E = -- 



■E, (18) 



where E = Yl!i=i d / dO^ is the vector field defined to 
satisfy 



'^"^ ^'^'-f.%^'-d^W. (19) 



de^dei 



i=l 



for an arbitrary vector field $X = \sum_{i=1}^{n+1} X^i \partial/\partial\theta^i$ on 
$\mathbf{R}^{n+1}_+$. Hence, if $X$ is tangent to $f(M)$, then the right-hand side of (19) vanishes. 
However, since the Hessian $(\partial^2\psi/\partial\theta^i\partial\theta^j)$ is positive definite, 
$E$ and $\xi$ are guaranteed to be transversal to $f(M)$. Further, we see from (17) that the 
immersion $(f, \xi)$ is centro-affine with a scaling constant $q(1-q)$. 

As summarized in the appendix, the centro-affine immersion realizes a statistical manifold 
$(\mathcal{S}^n, h, \nabla)$ with constant curvature. Actually, keeping in mind the relations 



and 



we have 



-D^;^=0, i,j = 1,- • -,71+ 1, 



De^ = {q-l)^-^^ z,j = l,...,n+l. 
dpj Pi opj 

Using these relations, we can calculate the Gauss and Weingarten formulas (cf. Appendix) for 
$\partial_i$ defined in (9) and $\xi$ as follows: 

^aA = (.-l)(^^ ' ' 



_ Pi dpt Pn+l dpn+1 
n 



k=l 

D9A = q{q-i)l^- 



d 



{ dpi dpn+i 



Here, we have used the expression: 

n+l „ 

which is equivalent to pT]) . Then, we see that the calcu- 
lated hij and Fjj respectively coincide with pU)) and PT|) . 

i.e., it holds that $h_{ij} = g_{ij}$ and $\Gamma^k_{ij} = \Gamma^{(\alpha)k}_{ij}$. Thus, the 
realized manifold coincides with $(\mathcal{S}^n, g, \nabla^{(\alpha)})$. Further, the affine shape 
operator $S = (s^i_j)$ and the transversal connection form $\tau$ are, respectively, 

$$ s^i_j = (1-q)\,q\,\delta^i_j, \qquad \tau = 0. $$ 

By the property F3) in the appendix, these two relations show that the realized manifold 
$(\mathcal{S}^n, g, \nabla^{(\alpha)})$ is a statistical manifold with constant curvature 
$\kappa = q(1-q)$. 

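This viewpoint clarifies the relation with the escort probability. Using the coordinates 
$(\theta^i)$, the escort probability $P_i$ is expressed [1] by 

$$ P_i(p) := \frac{(p_i)^q}{\sum_{j=1}^{n+1} (p_j)^q} = \frac{\theta^i(p)}{\sum_{j=1}^{n+1} \theta^j(p)}, \qquad i = 1, \cdots, n+1. $$ 

Hence, the set of escort probability distributions $P = (P_i)$ with positive $P_i$ is nothing but the 
probability simplex in the ambient space $\mathbf{R}^{n+1}_+$. We denote this set by 
$\mathcal{E}^n$. Recall, on the other hand, that the immersion $f(\mathcal{S}^n)$ is represented as a 
level surface of $\psi$ in $\mathbf{R}^{n+1}_+$ (see Figure 1). Thus, for each $(p_i)$, we can define 
a projection $\pi$ from $f(\mathcal{S}^n)$ to $\mathcal{E}^n$ by 

$$ \pi : \theta(p) \mapsto P(p) = \frac{\theta(p)}{\sum_{j=1}^{n+1} \theta^j(p)}, $$ 

where the denominator is the normalizing factor that makes the components of $P(p)$ sum to one. 

Fig. 1. Projective transformation $\pi$ from $f(\mathcal{S}^n)$ to $\mathcal{E}^n$ and the escort 
distribution $P$. 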

Now we consider the geometric structure of $\mathcal{E}^n$ in order to derive interesting properties 
of the escort probabilities. Since $\mathcal{E}^n$ is contained in a hyperplane of 
$\mathbf{R}^{n+1}$, it would be natural to use the flat connection^1 induced from $D$. We use, for 
brevity, the same symbol $D$ for the induced connection on $\mathcal{E}^n$. Then note that any 
straight line segment on $\mathcal{E}^n$ is a geodesic of $(\mathcal{E}^n, D)$. 

^1 It corresponds to the mixture connection [6] for the escort distributions $(P_i)$ in the 
terminology of information geometry. 

Let $T$ be the 2-dimensional sector defined by 

$$ T := \{ \theta \mid \theta = \beta_1 \theta(p) + \beta_2 \theta(r),\ \beta_1 > 0,\ \beta_2 > 0 \}. $$ 

Then it is known [9, p. 44] that the $\nabla^{(\alpha)}$-geodesic curve $\gamma^{(\alpha)}$ 
connecting $p$ and $r$ is represented on $f(\mathcal{S}^n)$ by 

$$ f(\gamma^{(\alpha)}) = f(\mathcal{S}^n) \cap T. \qquad (20) $$ 

On the other hand, $T$ also includes the points $P(p)$ and $P(r)$ by the definition of the escort 
probabilities. The intersection $\mathcal{E}^n \cap T$ is a straight line segment, i.e., a geodesic 
of $(\mathcal{E}^n, D)$, connecting $P(p)$ and $P(r)$ (Figure 1). 

Thus, $\pi \circ f$ maps every $\nabla^{(\alpha)}$-geodesic curve on 
$(\mathcal{S}^n, \nabla^{(\alpha)})$ to a geodesic (line segment) on $(\mathcal{E}^n, D)$. In this 
sense, $\pi \circ f$ is a projective transformation from $(\mathcal{S}^n, \nabla^{(\alpha)})$ to the 
flat manifold $(\mathcal{E}^n, D)$. The above observation is summarized as follows: 

Proposition 2 The escort distribution $(P_i)$ is geometrically interpreted as a normalized affine 
coordinate system of the manifold $\mathcal{E}^n$ with the flat connection $D$, which is projectively 
transformed by $\pi \circ f$ from the probability simplex $\mathcal{S}^n$ with the connection 
$\nabla^{(\alpha)}$. 
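
The projective picture can be illustrated numerically. Assuming the representation of the $\alpha$-geodesic as the intersection of $f(\mathcal{S}^n)$ with the sector spanned by $\theta(p)$ and $\theta(r)$, a geodesic point has $(p_i(t))^q$ proportional to $(1-t)(p_i)^q + t(r_i)^q$, and its escort distribution lies on the straight segment between $P(p)$ and $P(r)$. The sketch below uses hypothetical function names and an arbitrary (non-affine) curve parameter $t$.

```python
import numpy as np

def escort(p, q):
    """Escort distribution P_i = p_i^q / sum_j p_j^q."""
    w = p ** q
    return w / w.sum()

def alpha_geodesic_point(p, r, q, t):
    """Point of the simplex whose theta-image is proportional to
    (1 - t) * theta(p) + t * theta(r), with theta^i = p_i^q / q."""
    w = ((1.0 - t) * p ** q + t * r ** q) ** (1.0 / q)
    return w / w.sum()

def segment_parameter(p, r, q, t):
    """Weight s such that escort(geodesic point) = (1 - s) P(p) + s P(r)."""
    a, b = np.sum(p ** q), np.sum(r ** q)
    return t * b / ((1.0 - t) * a + t * b)

p = np.array([0.1, 0.2, 0.7])
r = np.array([0.5, 0.3, 0.2])
q = 0.6

max_gap = 0.0
for t in np.linspace(0.0, 1.0, 11):
    P_t = escort(alpha_geodesic_point(p, r, q, t), q)
    s = segment_parameter(p, r, q, t)
    line_point = (1.0 - s) * escort(p, q) + s * escort(r, q)
    max_gap = max(max_gap, np.max(np.abs(P_t - line_point)))
```

Here `max_gap` stays at rounding level, so the image of the curve under the escort map is a line segment, traversed with the reparametrized weight `s`.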

There are, at least, two possibilities A) and B) to introduce a geometric structure equipped with the 
Legendre duality to the flat manifold $(\mathcal{E}^n, D)$. 

A) The one is to consider the Kullback-Leibler divergence, i.e., the $-1$-divergence, for two escort 
distributions $P = (P_i)$ and $P' = (P'_i)$ on $\mathcal{E}^n$: 

$$ D^{(-1)}(P,P') = \sum_{i=1}^{n+1} P_i \ln \frac{P_i}{P'_i}. $$ 

Then $\mathcal{E}^n$ is the well-developed dually flat statistical manifold [5,6] with the Fisher 
information matrix as a Riemannian metric. The obtained flat connection $D^*$ on $\mathcal{E}^n$ dual 
to $D$ is called the exponential connection. This structure is helpful when we apply the standard 
technique of statistical physics to the escort distributions (e.g., [1]) and translate the results to 
the usual distributions. 

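B) The other possibility is to induce the structure from the geometry of the ambient space 
$\mathbf{R}^{n+1}_+$ by regarding $\mathcal{E}^n$ as its submanifold. We do not describe the detail 
(see the note below), but we only show the corresponding divergence on $\mathcal{E}^n$. 

Let $\varphi(\eta)$ be the Legendre transform of $\psi(\theta)$, i.e., 

$$ \varphi(\eta) := \sup_{\theta \in \mathbf{R}^{n+1}_+} \Big\{ \sum_{i=1}^{n+1} \theta^i \eta_i - \psi(\theta) \Big\} = \frac{1}{q} \sum_{i=1}^{n+1} \big( (1-q)\eta_i \big)^{1/(1-q)}. \qquad (21) $$ 

Construct the canonical divergence $\mathcal{D}$ [5,6] on 
$\mathbf{R}^{n+1}_+ \times \mathbf{R}^{n+1}_+$ using $\psi$ and the dual parameters $\eta_i$; then 
for two points $\theta$ and $\theta'$ in $\mathbf{R}^{n+1}_+$ we have 

$$ \mathcal{D}(\theta,\theta') := \psi(\theta) + \varphi(\eta') - \sum_{i=1}^{n+1} \theta^i \eta'_i = \psi(\theta) - \psi(\theta') - \sum_{i=1}^{n+1} \eta'_i (\theta^i - \theta'^i). \qquad (22) $$ 

The divergence $\mathcal{D}$ reproduces the $\alpha$-divergence (7) on 
$f(\mathcal{S}^n) \times f(\mathcal{S}^n)$ due to the relations (15) and 

$$ \eta_i(p) = L^{(-\alpha)}(p_i) = \frac{1}{1-q}(p_i)^{1-q}, \qquad i = 1, \cdots, n+1. \qquad (23) $$ 

Simultaneously, it induces the following natural divergence for two escort distributions 
$P = (P_i)$ and $P' = (P'_i)$ on $\mathcal{E}^n$: 

$$ \mathcal{D}(P,P') = \psi(P) - \psi(P') - \sum_{i=1}^{n+1} (P_i - P'_i)\, P'^*_i, \qquad (24) $$ 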

where $P^*_i$ is defined by 

$$ P^*_i := \frac{\partial\psi}{\partial\theta^i}\Big|_{\theta = P} = \frac{1}{1-q}\,(q P_i)^{(1-q)/q}. $$ 

Note: The canonical divergence defines the flat statistical manifold structure 
$(\mathbf{R}^{n+1}_+, \tilde{g}, D)$ with the Riemannian metric $\tilde{g}$: 

$$ \tilde{g}_{ij} = -\frac{\partial}{\partial\theta^i}\frac{\partial}{\partial\theta'^j}\, \mathcal{D}(\theta,\theta') \Big|_{\theta = \theta'} = \frac{\partial^2\psi}{\partial\theta^i \partial\theta^j} \qquad (25) $$ 

and the flat connection $D$. Further, it induces a geometry on $\mathcal{S}^n$ via 
$f(\mathcal{S}^n)$ as the statistical submanifold of $\mathbf{R}^{n+1}_+$. This is essentially the 
same way as in [23] or section 3. We have seen in this section that the induced geometry coincides 
with the one defined by the affine immersion $(f, \xi)$. This is due to the special choice of the 
transversal vector $\xi$ satisfying (18) and (19). See [24,25,26] for details. 

The manifold $(\mathbf{R}^{n+1}_+, \tilde{g}, D)$ also induces a geometry on the submanifold 
$\mathcal{E}^n$. It can be proved to be a flat statistical manifold and its corresponding divergence 
is given in (24). The induced geometry is known to be not only projectively but also conformally 
transformed by $\pi \circ f$ from $(\mathcal{S}^n, g, \nabla^{(\alpha)})$, which directly follows 
from the concept of $-1$-conformal equivalence [5]. 



5 Generalization via centro-affine immersion 

In this section we discuss a possible generalization of entropy, relative entropy (divergence) and 
escort probability that keeps the statistical manifold structure of $\mathcal{S}^n$. The key idea is 
to apply the method of centro-affine immersion observed in the previous section. 

Consider a smooth representing function $s = L(t)$ defined on $t > 0$ satisfying the following 
assumptions: 

A1 $L$ is strictly increasing, 
A2 $L(0) = 0$, 
A3 $d^2L/dt^2 < 0$ for all $t > 0$. 

These assumptions ensure that the strictly increasing inverse function $L^{-1}$ exists and meets the 
conditions $L^{-1}(0) = 0$ and $d^2L^{-1}/ds^2 > 0$ for all $s > 0$. Hence, $L^{-1}$ is a convex 
function. 

Let $f$ be an affine immersion defined by (14) with $L$ in place of $L^{(\alpha)}$; then 
$\mathcal{S}^n$ is immersed in $\mathbf{R}^{n+1}_+$ as a level surface of $\psi^{(L)}$: 

$$ f(\mathcal{S}^n) = \{ \theta \in \mathbf{R}^{n+1}_+ \mid \psi^{(L)}(\theta) = c \}. $$ 

Here, $\psi^{(L)}$ is a convex function defined by 

$$ \psi^{(L)}(\theta) := c \sum_{i=1}^{n+1} L^{-1}(\theta^i), $$ 

which has a positive definite Hessian from the assumptions. The level $c$ is arbitrary. For the case 
of the $\alpha$-geometry we have used $L = L^{(\alpha)}$ and $c = 2/(1+\alpha)$. 

We take a transversal vector $\xi$ so that the affine immersion $(f, \xi)$ is centro-affine with a 
scaling constant $k$: 

$$ \xi := -k \sum_{i=1}^{n+1} \theta^i \frac{\partial}{\partial\theta^i}, $$ 

where $k$ is an arbitrary constant. Since $f(\mathcal{S}^n)$ is a strongly convex surface, $\xi$ 
defined above is transversal to $f(\mathcal{S}^n)$. Hence, according to F3) in the appendix, we find 
that $(f, \xi)$ realizes a statistical manifold $(\mathcal{S}^n, h^{(L)}, \nabla^{(L)})$ with 
constant curvature $k$. 

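Instead of the canonical divergence, we invoke the geometric divergence [8]: 

$$ D^{(L)}(p,r) := -\sum_{i=1}^{n+1} \nu_i(r)\, \big( \theta^i(p) - \theta^i(r) \big), \qquad (26) $$ 

where $\nu = \sum_{i=1}^{n+1} \nu_i\, d\theta^i \in (\mathbf{R}^{n+1})^*$ is called the conormal 
vector [9], with components 

$$ \nu_i := \frac{1}{\Lambda} \frac{\partial\psi^{(L)}}{\partial\theta^i}, \qquad \Lambda := k \sum_{i=1}^{n+1} \theta^i \frac{\partial\psi^{(L)}}{\partial\theta^i}. \qquad (27) $$ 

Here, $(\mathbf{R}^{n+1})^*$ denotes the dual space of $\mathbf{R}^{n+1}$, and $\nu_i$ plays a role 
similar to that of the dual parameter $\eta_i$ in the case of the $\alpha$-geometry. 

It is known [8] that the statement analogous to Proposition 1 holds for $D^{(L)}$, i.e., for a right 
triangle on $\mathcal{S}^n$ with respect to the realized Riemannian metric $h^{(L)}$ and the dual 
connections $\nabla^{(L)}$ and $\nabla^{(L)*}$, it holds that 

$$ D^{(L)}(p,r) = D^{(L)}(p,q) + D^{(L)}(q,r) - k\, D^{(L)}(p,q)\, D^{(L)}(q,r). \qquad (28) $$ 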

Using the uniform distribution $u$, we can define an entropy $S^{(L)}$ associated with the divergence 
$D^{(L)}$ by 

$$ S^{(L)}(p) := M - D^{(L)}(p,u), \qquad (29) $$ 

where 

$$ M := \max_{p} D^{(L)}(p,u). $$ 

Then $S^{(L)}$ is confirmed to meet positivity, convexity, continuity and to take the maximum at 
$p = u$. The modified Pythagorean relation (28) is essentially important to investigate the 
minimization of the generalized divergence $D^{(L)}$ or the maximization of the generalized entropy 
$S^{(L)}$, as is the case for $K_q$ or $S_q$ [23]. 

The generalized escort probability (see also [1]) is similarly defined by 

$$ P^{(L)}_i(p) := \frac{L(p_i)}{\sum_{j=1}^{n+1} L(p_j)}, \qquad i = 1, \cdots, n+1. $$ 

The projectivity and the property in Proposition 2 are inherited by this generalization. 

However, unlike the case of the previous section, the divergence defined on $\mathbf{R}^{n+1}_+$ by 
(22) with $\psi^{(L)}$ instead of $\psi$ does not generally induce $D^{(L)}$ in (26) on 
$\mathcal{S}^n$. By comparing (22) and (26), we see that the property holds if and only if $\nu_i$ 
coincides with the dual parameter of $\theta^i$, i.e., 

$$ \nu_i = \frac{\partial\psi^{(L)}}{\partial\theta^i}, \qquad i = 1, \cdots, n+1. $$ 

From (27), it is equivalent to the condition that $\Lambda = 1$, i.e., 

$$ k \sum_{i=1}^{n+1} \theta^i \frac{\partial\psi^{(L)}}{\partial\theta^i} = 1 $$ 

holds on $f(\mathcal{S}^n)$. Thus, the property is specific to the case where $L$ is a power function 
like (15). 


6 Application (1): Alpha-convexity and 
autoparallelism of submanifolds constrained 
by expectations 

In this section, we apply the geometry $(\mathcal{S}^n, g, \nabla^{(\alpha)})$ to discuss convexity 
of submanifolds constrained by the modified expectations (averages). Convexity of the submanifold is 
crucial in considering the maximization of $S_q$. Here we restrict ourselves to the case that 
$q > 0$, $q \neq 1$. The results can be generalized to $(\mathcal{S}^n, h^{(L)}, \nabla^{(L)})$ via 
analogous arguments. 

A subset $A$ in $\mathcal{S}^n$ is said to be $\nabla^{(\alpha)}$-autoparallel if it holds that 

$$ f(A) = f(\mathcal{S}^n) \cap T $$ 

for a certain open convex set $T$ contained in a linear subspace of $\mathbf{R}^{n+1}$ with respect 
to the $\theta$-coordinate system. The characterization of the $\nabla^{(\alpha)}$-geodesic curve in 
(20) is the special case where the dimension of the linear subspace is two. A subset $C$ in 
$\mathcal{S}^n$ is said to be $\nabla^{(\alpha)}$-convex if the $\alpha$-geodesic curve connecting 
two arbitrary points in $C$ is contained in $C$. Note that $A$ is $\nabla^{(\alpha)}$-convex if it is 
$\nabla^{(\alpha)}$-autoparallel. 

The $\nabla^{(\alpha)}$-convexity and $\nabla^{(\alpha)}$-autoparallelism are important in studying 
the Tsallis relative entropy minimization from the viewpoint of optimization, because the modified 
Pythagorean theorem guarantees the uniqueness of the minimizing distribution on a 
$\nabla^{(\alpha)}$-convex set. In this case, the minimizing distribution is characterized by the 
so-called $-\alpha$-projection [5,6,23]. 

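Let $A_i$ be the quantity assigned to the i-th microstate and consider the linear, q- and 
q-normalized expectations [27], respectively defined by 

$$ \langle A \rangle := \sum_{i=1}^{n+1} p_i A_i, \qquad \langle A \rangle_q := \sum_{i=1}^{n+1} (p_i)^q A_i, \qquad \langle\langle A \rangle\rangle_q := \frac{\sum_{i=1}^{n+1} (p_i)^q A_i}{\sum_{j=1}^{n+1} (p_j)^q}. $$ 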

For a prescribed value A, define the constrained man- 
ifolds in 5" by 

n+ := 5" n {p\ {A) > A}, H+ 5" n {p\ {A)g > A}, 

n+ ■.= s-n{p\{{A))g>A}. 

Similarly, , Hg , Hg , and the boundaries H, Tig, Ti, can 
be defined by replacing the inequality symbols by the re- 
verse and the equality ones, respectively. 

Since the constraints for Ti^ ,Hg or Hg are non- 
linear with respect to p, the corresponding equilibrium 
distributions for the Tsallis relative entropy Kg{-,r) (or 
a-divergence D'^°^\-, r)) are not necessarily unique nor the 
minimizers for them, while they are convex in the usual 
sense with respect to p when q> Q and 9 7^ 1. 

Let the constrained subsets be nonempty. Then they 
have the following properties: 

1) v.- (resp. n+) is V(")-convex if < g < 1 (g > 1) and 
Ai> Q for aU i, 

2) {Hg) is V(")-convex if < g < 1 (1< g), 

3) Let < g < 1 (1 < g) and < A, (0 > Ai) for aU i. 
Then Tig is a D'^"^ -sphere, i.e., 

=5"n{p|Z?(")(p,r) = 4 

for some r € 5" and d G R, 

4) Both Tig are -convex, 

5) Tig is a V'-"-'-autoparallel submanifold in 5". 

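All of the above statements follow from the standard convexity argument with respect to the 
$\theta$-coordinate based on the definition of the $\alpha$-convexity. In particular, note that the 
constraints $\langle A \rangle_q = \bar{A}$ and $\langle\langle A \rangle\rangle_q = \bar{A}$ for a 
prescribed value $\bar{A}$ are respectively characterized by linear constraints in $\theta$, i.e., 

$$ q \sum_{i=1}^{n+1} A_i \theta^i = \bar{A}, \qquad \sum_{i=1}^{n+1} (A_i - \bar{A})\, \theta^i = 0. $$ 

For 3), recall that the $\alpha$-divergence can be alternatively expressed, using (23), by 

$$ D^{(\alpha)}(p,r) = \frac{1}{q(1-q)} - \sum_{i=1}^{n+1} \theta^i(p)\, \eta_i(r). \qquad (30) $$ 

Then we can verify the statement by setting 

$$ r_i = \frac{\big( (1-q) A_i \big)^{1/(1-q)}}{c}, \qquad d = \frac{1}{q}\Big( \frac{1}{1-q} - c^{\,q-1}\bar{A} \Big), $$ 

where 

$$ c := \sum_{i=1}^{n+1} \big( (1-q) A_i \big)^{1/(1-q)}. $$ 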
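
The sphere property in 3) can be made concrete with a short numerical check. The sketch below assumes $0 < q < 1$ and $A_i > 0$, builds the center $r$ with $\eta_i(r)$ proportional to $A_i$, and confirms that the $\alpha$-divergence from any $p$ to $r$ depends on $p$ only through its q-expectation. The function names and the normalizer `c` are illustrative choices, not notation confirmed by the source.

```python
import numpy as np

def alpha_divergence_q(p, r, q):
    """alpha-divergence written with q = (1 - alpha)/2."""
    return (1.0 - np.sum(p ** q * r ** (1.0 - q))) / (q * (1.0 - q))

def q_expectation(p, A, q):
    """Unnormalized q-expectation <A>_q = sum_i p_i^q A_i."""
    return np.sum(p ** q * A)

def sphere_center_and_radius(A, A_bar, q):
    """Center r and level d such that D(p, r) = d whenever <A>_q = A_bar."""
    w = ((1.0 - q) * A) ** (1.0 / (1.0 - q))
    c = w.sum()                              # normalizer making r a distribution
    r = w / c
    d = (1.0 / (1.0 - q) - c ** (q - 1.0) * A_bar) / q
    return r, d

q = 0.5
A = np.array([1.0, 2.0, 3.0, 4.0])
samples = [np.array([0.1, 0.2, 0.3, 0.4]), np.array([0.4, 0.3, 0.2, 0.1])]

gaps = []
for p in samples:
    r, d = sphere_center_and_radius(A, q_expectation(p, A, q), q)
    gaps.append(abs(alpha_divergence_q(p, r, q) - d))
```

The center `r` does not depend on the prescribed value, only the level `d` does, so the constraint sets for different values of the q-expectation form concentric spheres in the sense of the divergence.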



For the detail of the Tsallis relative entropy minimiza- 
tion with the constraints of Hq , see [23] . 

7 Application (2): Gradient flow of the 
alpha-divergence 

As another application we investigate the gradient flow of the $\alpha$-divergence. Let 
$X = \sum_{i=1}^{n} X^i \partial_i$ be the gradient vector on $\mathcal{S}^n$ minimizing 
$D^{(\alpha)}(p,r)$ for a given distribution $r$, where $\partial_i$ is defined by (9). Then the 
component $X^i$ is expressed by 

$$ X^i = -\sum_{j=1}^{n} g^{ij}\, \partial_j D^{(\alpha)}(p,r), \qquad (31) $$ 

where $(g^{ij})$ is the inverse matrix of the Riemannian metric $g = (g_{ij})$ in (10). 

Since $D^{(\alpha)}(p,r)$ is strongly convex with respect to $p$ in $\mathcal{S}^n$, the flow 
converges to $r$. Further, the constant curvature property of 
$(\mathcal{S}^n, g, \nabla^{(\alpha)})$ restricts the behavior of the gradient flow as follows. For 
the dually flat case, the corresponding results are found in [28]. 

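Proposition 3 (constants of motion) The trajectory of the gradient flow $p(t)$ of 
$D^{(\alpha)}(p,r)$ on $(\mathcal{S}^n, g, \nabla^{(\alpha)})$ with an initial point $p_0$ coincides 
with the $\nabla^{(-\alpha)}$-geodesic curve connecting $p_0$ and $r$. Further, let 
$\theta_l = (\theta_l^i) \in \mathbf{R}^{n+1}$, $l = 1, \cdots, n-1$ be the set of linearly 
independent vectors satisfying 

$$ \sum_{i=1}^{n+1} \theta_l^i\, \eta_i(p_0) = 0, \qquad \sum_{i=1}^{n+1} \theta_l^i\, \eta_i(r) = 0, \qquad l = 1, \cdots, n-1. $$ 

Then, the quantities $C_l$ defined by 

$$ C_l := \sum_{i=1}^{n+1} \theta_l^i\, \eta_i(p(t)) $$ 

are the $n-1$ independent constants of motion for the gradient flow. 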

See appendix B for the proof. 
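
A numerical experiment makes the constants of motion tangible. The sketch below rests on two assumptions that are labeled here because they are not spelled out in the text: the inverse Fisher metric on the simplex is $g^{ij} = p_i\delta_{ij} - p_i p_j$, which turns the gradient flow into $\dot{p}_i = \{(p_i)^q (r_i)^{1-q} - p_i \sum_j (p_j)^q (r_j)^{1-q}\}/(1-q)$, and the conserved quantities pair vectors $\theta_l$ with $\eta_i(p) = (p_i)^{1-q}/(1-q)$. The integrator (classical Runge-Kutta) and all function names are illustrative choices.

```python
import numpy as np

def eta(p, q):
    """Dual coordinates eta_i = p_i^(1-q) / (1 - q)."""
    return p ** (1.0 - q) / (1.0 - q)

def alpha_divergence_q(p, r, q):
    """alpha-divergence written with q = (1 - alpha)/2."""
    return (1.0 - np.sum(p ** q * r ** (1.0 - q))) / (q * (1.0 - q))

def flow_rhs(p, r, q):
    """Gradient-flow vector field of p -> D(p, r) w.r.t. the Fisher metric."""
    terms = p ** q * r ** (1.0 - q)
    return (terms - p * terms.sum()) / (1.0 - q)

def integrate_flow(p0, r, q, dt=1e-3, steps=5000):
    """Classical fourth-order Runge-Kutta integration of the flow."""
    p = p0.astype(float).copy()
    traj = [p.copy()]
    for _ in range(steps):
        k1 = flow_rhs(p, r, q)
        k2 = flow_rhs(p + 0.5 * dt * k1, r, q)
        k3 = flow_rhs(p + 0.5 * dt * k2, r, q)
        k4 = flow_rhs(p + dt * k3, r, q)
        p = p + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        traj.append(p.copy())
    return np.array(traj)

def constants_of_motion(traj, p0, r, q):
    """Evaluate C_l(t) for vectors theta_l orthogonal to eta(p0) and eta(r)."""
    _, _, vt = np.linalg.svd(np.vstack([eta(p0, q), eta(r, q)]))
    theta_l = vt[2:]                     # n - 1 vectors spanning the null space
    return eta(traj, q) @ theta_l.T

q = 0.5
p0 = np.array([0.7, 0.1, 0.1, 0.1])
r = np.array([0.1, 0.2, 0.3, 0.4])
traj = integrate_flow(p0, r, q)
C = constants_of_motion(traj, p0, r, q)
```

With four states ($n = 3$) there are two such quantities; both stay at zero along the computed trajectory up to integration error, while the divergence to `r` decays.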



8 Generalized exponential family and 
U-geometry 

Now we discuss the Legendre duality in the case of iii) in section 2 following [11,12,13,14,15]. For 
a fixed strictly increasing and positive function $\phi(s)$ on $(0, \infty)$, define the generalized 
logarithmic function as follows: 

$$ \ln_\phi(t) := \int_1^t \frac{1}{\phi(s)}\, ds, \qquad t > 0. \qquad (32) $$ 

The generalized exponential function denoted by $\exp_\phi$ is defined as the inverse function of 
$\ln_\phi$. 
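
As a concrete instance, the choice $\phi(s) = s^q$ (an illustrative assumption, not fixed by the text at this point) turns the integral of $1/\phi$ from 1 to $t$ into the q-logarithm $(t^{1-q} - 1)/(1-q)$. The sketch below compares a simple trapezoidal quadrature with this closed form and checks that the corresponding q-exponential inverts it; the function names are hypothetical.

```python
import numpy as np

def ln_phi_numeric(t, phi, num=20001):
    """Trapezoidal approximation of the integral of 1/phi(s) from 1 to t."""
    s = np.linspace(1.0, t, num)
    y = 1.0 / phi(s)
    ds = s[1] - s[0]                     # negative when t < 1, as required
    return ds * (y.sum() - 0.5 * (y[0] + y[-1]))

def ln_q(t, q):
    """Closed form of ln_phi for phi(s) = s^q, q != 1."""
    return (t ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(x, q):
    """Inverse of ln_q on the range where 1 + (1 - q) x > 0."""
    return (1.0 + (1.0 - q) * x) ** (1.0 / (1.0 - q))

q = 0.5
phi = lambda s: s ** q
errors = [abs(ln_phi_numeric(t, phi) - ln_q(t, q)) for t in (0.3, 1.0, 2.5)]
round_trip = exp_q(ln_q(2.5, q), q)
```

Since $\phi$ is positive and increasing, $\ln_\phi$ is increasing and concave, just like the ordinary logarithm obtained from $\phi(s) = s$.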

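Define a convex function $F_\phi(s)$ for $s > 0$ by 

$$ F_\phi(s) := \int_1^s \ln_\phi t\, dt, \qquad \lim_{s \to 0+} F_\phi(s) < +\infty \ \text{(assumed)}. \qquad (33) $$ 

For probability density functions $p(x)$ and $r(x)$, introduce a generalized entropy functional 
defined by 

$$ \mathcal{I}_\phi[p] := \int -F_\phi(p(x)) + (1 - p(x))\, F_\phi(0)\, dx, \qquad (34) $$ 

and the Bregman divergence defined by 

$$ \mathcal{D}_\phi[p \| r] := \int U_\phi(\ln_\phi r) - U_\phi(\ln_\phi p) - p\,(\ln_\phi r - \ln_\phi p)\, dx, \qquad (35) $$ 

where the function $U_\phi$ is the Legendre conjugate of $F_\phi$ defined by 

$$ U_\phi(t) := t \exp_\phi t - F_\phi(\exp_\phi t). \qquad (36) $$ 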

Let us consider the following finite dimensional statis- 
tical model called the generalized exponential family |29| 
((/)-exponential family [T^ or [/-statistical model [H]), which 
is defined by 

= {peix) = e^p^ie^h{x) - M^))\9 e 12 c R"*} 

where h{x) — {hi{x)),i = is a certain vector- 

valued function and ip^{0) is a normalizing factor of pe(a;). 
Introduce the following potential function: 



^^{9) J U^iln^pg) + {l~pg)F40)dx + ^^{9). 
It follows from the relation exp^ — that 

7^,(9) := ^^^4,{9) = I h^{x)pe{x)dx = Ep,[/i,(x)], (37) 



where $\partial_i := \partial/\partial\theta^i$ and we denote by $\mathrm{E}_p[\cdot]$ the expectation
operator for the density $p$. Then, the Hesse matrix of $\Psi_\phi(\theta)$
is expressed by

$$\partial_i \partial_j \Psi_\phi(\theta) = \int \tilde h_i(x)\,\exp_\phi'\!\left(\theta^T h(x) - \psi_\phi(\theta)\right) \tilde h_j(x)\,dx, \tag{38}$$

where $\tilde h_i(x) := h_i(x) - \partial_i \psi_\phi(\theta)$. We see that it is positive
semidefinite because $\exp_\phi'$ is positive, and hence $\Psi_\phi$
is a convex function of $\theta$. In the sequel, we assume that
$(\partial_i \partial_j \Psi_\phi) = (\partial\eta_j/\partial\theta^i)$ is positive definite for all $\theta \in \Omega$.
Hence, $\eta = (\eta_i)$ is locally bijective to $\theta = (\theta^i)$ and we call
$\eta = (\eta_i)$ the expectation coordinate system for $\mathcal{M}_\phi$. By
the relation (37), the Legendre conjugate of $\Psi_\phi(\theta)$ is the
sign-reversed generalized entropy of $p_\theta \in \mathcal{M}_\phi$, i.e.,

$$\Psi_\phi^*(\eta) = \theta^T \eta - \Psi_\phi(\theta) = -\mathcal{I}_\phi[p_\theta]. \tag{39}$$

Hence, $\Psi_\phi(\theta)$ can be physically interpreted as the generalized
Massieu potential [30,31], and our Riemannian metric
$(\partial_i \partial_j \Psi_\phi) = (\partial\eta_j/\partial\theta^i)$ introduced below is regarded as a
susceptance matrix.
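The relations (37) and (39) can be illustrated on a toy model. The sketch below replaces the integrals by sums over a three-point sample space and takes $\phi(s) = s^q$ with $q = 0.5$; both choices, the sample values in `H`, and all function names are assumptions made purely for illustration.

```python
Q = 0.5  # phi(s) = s**Q (illustrative choice)

def ln_q(t):
    return (t ** (1.0 - Q) - 1.0) / (1.0 - Q)

def exp_q(y):
    base = 1.0 + (1.0 - Q) * y
    return base ** (1.0 / (1.0 - Q)) if base > 0.0 else 0.0

def F(s):
    """F(s) = integral of ln_q from 1 to s, cf. (33)."""
    return (s ** (2.0 - Q) - 1.0) / ((2.0 - Q) * (1.0 - Q)) - (s - 1.0) / (1.0 - Q)

def U(t):
    """Legendre conjugate of F, cf. (36)."""
    e = exp_q(t)
    return t * e - F(e)

H = [-1.0, 0.0, 1.0]  # values of h(x) on a hypothetical three-point space

def psi(theta):
    """Normalizing factor: solve sum_x exp_q(theta*h(x) - psi) = 1 by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sum(exp_q(theta * h - mid) for h in H) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def density(theta):
    s = psi(theta)
    return [exp_q(theta * h - s) for h in H]

def eta(theta):
    """Expectation coordinate: E[h] under p_theta."""
    return sum(h * p for h, p in zip(H, density(theta)))

def entropy(p):
    """Discrete analogue of the generalized entropy (34)."""
    return sum(-F(pi) + (1.0 - pi) * F(0.0) for pi in p)

def Psi(theta):
    """Discrete analogue of the potential function Psi_phi."""
    p = density(theta)
    return sum(U(ln_q(pi)) + (1.0 - pi) * F(0.0) for pi in p) + psi(theta)
```

A central difference of `Psi` should then agree with `eta` (relation (37)), and `theta * eta(theta) - Psi(theta)` should equal minus the entropy of `density(theta)` (relation (39)), up to rounding.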

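The $U$-geometry [11] is introduced as follows: As a Riemannian
metric $g = (g_{ij})$ on $\mathcal{M}_\phi$, which is an inner product
for tangent vectors, we use the Hesse matrix of $\Psi_\phi$.
Note that we can alternatively express (38) as

$$g_{ij}(\theta) = g(\partial_i, \partial_j) := \partial_i \partial_j \Psi_\phi = \int \partial_i p_\theta\, \partial_j \ln_\phi p_\theta\, dx.$$

Further we define a generalized version of the mixture
connection $\nabla^{(m)}$ and exponential connection $\nabla^{(e)}$ by their
components

$$\Gamma^{(m)}_{ij,k}(\theta) = g(\nabla^{(m)}_{\partial_i} \partial_j, \partial_k) := \int \partial_i \partial_j p_\theta\, \partial_k \ln_\phi p_\theta\, dx,$$

$$\Gamma^{(e)}_{ij,k}(\theta) = g(\nabla^{(e)}_{\partial_i} \partial_j, \partial_k) := \int \partial_k p_\theta\, \partial_i \partial_j \ln_\phi p_\theta\, dx. \tag{40}$$

Then the duality relation of the connections [5,6] holds,
i.e., $\partial_i g_{jk} = \Gamma^{(m)}_{ij,k} + \Gamma^{(e)}_{ik,j}$. Further, $\mathcal{M}_\phi$ can be proved to
be flat with respect to both $\nabla^{(m)}$ and $\nabla^{(e)}$. Thus, we have
obtained a dually flat [6] structure $(g, \nabla^{(m)}, \nabla^{(e)})$ on $\mathcal{M}_\phi$
defined by the derivatives of $\Psi_\phi$.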

Proposition 4 Let $C$ be a one-dimensional submanifold
on $\mathcal{M}_\phi$. If $C$ is expressed as a straight line in the
coordinates $\theta$, then $C$ coincides with a $\nabla^{(e)}$-geodesic (e-geodesic,
in short) curve. If $C$ is expressed as a straight line in the
coordinates $\eta$, then $C$ coincides with a $\nabla^{(m)}$-geodesic
(m-geodesic) curve.

Definition 1 Let $p(x)$ be a given density. If there exists
the minimizing density function $p_{\hat\theta}(x)$ for the variational
problem $\min_{p_\theta \in \mathcal{M}_\phi} \mathcal{D}_\phi[p\|p_\theta]$, or equivalently, the
minimizing parameter $\hat\theta$ for the problem $\min_{\theta \in \Omega} \mathcal{D}_\phi[p\|p_\theta]$ exists,
we call $\hat p_\theta(x) = p_{\hat\theta}(x)$ the m-projection of $p(x)$ to $\mathcal{M}_\phi$.

Proposition 5 Let $\hat p_\theta \in \mathcal{M}_\phi$ be the m-projection of $p$.
Then the following properties hold:

i) The expectation of $h(x)$ is conserved by the m-projection,
i.e., $\mathrm{E}_p[h(x)] = \mathrm{E}_{\hat p_\theta}[h(x)]$,

ii) The following triangular equality holds: $\mathcal{D}_\phi[p\|p_\theta] =
\mathcal{D}_\phi[p\|\hat p_\theta] + \mathcal{D}_\phi[\hat p_\theta\|p_\theta]$ for all $p_\theta \in \mathcal{M}_\phi$.

Remark 1 From the statement i), the m-projection $\hat p_\theta$ is
characterized as the density in $\mathcal{M}_\phi$ with the same
expectation of $h(x)$ as that of $p$. Note that the following
relation holds:

$$\mathcal{D}_\phi[p\|\hat p_\theta] = \mathcal{I}_\phi[\hat p_\theta] - \mathcal{I}_\phi[p].$$

Thus, $\hat p_\theta$ achieves the maximum entropy among densities
with the equal expectation of $h(x)$.



9 Application (3): Nonlinear diffusion equation

Let $u(x,t)$ and $\rho(x,\tau)$ on $\mathbf{R}^n \times \mathbf{R}_+$ be, respectively, the
solutions of the following nonlinear diffusion equation, which
is called the porous medium equation (PME):

$$\frac{\partial u}{\partial t} = \Delta u^m, \qquad m > 1 \tag{41}$$

with nonnegative initial data $0 \le u(x,0) = u_0(x) \in L^1(\mathbf{R}^n)$,
and the associated nonlinear Fokker-Planck equation (NFPE):

$$\frac{\partial \rho}{\partial \tau} = \nabla \cdot (\beta x \rho + D \nabla \rho^m), \qquad \beta > 0 \tag{42}$$

with nonnegative initial data $0 \le \rho(x,0) = \rho_0(x) \in L^1(\mathbf{R}^n)$.
Here, $D$ is a real symmetric positive definite matrix, which
represents the diffusion coefficients. As is widely known
[47,48] and shown later, the solutions of both equations
are related by a simple transformation.

The PME and NFPE with $m > 1$ represent the so-called
slow diffusion phenomena, which naturally arise in
many physical problems including percolation of a fluid
through porous media. See [32,33,34,35,36]
and the references therein. Hence the behaviors of their
solutions have been extensively studied in both analytical
and thermostatistical aspects in the literature [37,38,39,40,41,42,43,44, ...],
just to name a few.

In this section we demonstrate that several geometric
concepts derived with generalized entropies in the previous
section are useful to investigate new aspects of the
behavior of the above equations. For the proofs of the
results see [16].

9.1 Several geometric properties of the porous
medium and the associated Fokker-Planck equation

Set $\phi(u) = u^q$, $q > 0$, $q \ne 1$; then we have the q-logarithmic
and q-exponential functions [51]:

$$\ln_\phi t = \ln_q t := \frac{t^{1-q} - 1}{1 - q},$$

$$\exp_\phi t = \exp_q t := [1 + (1 - q)t]^{1/(1-q)}.$$

Consider the q-Gaussian density function defined by:

$$f(x; \theta, \Theta) = \exp_q\!\left(\theta^T x + x^T \Theta x - \psi(\theta, \Theta)\right), \tag{43}$$
$$\theta = (\theta^i) \in \mathbf{R}^n, \qquad \Theta = (\theta^{ij}) \in \mathbf{R}^{n \times n},$$

where $\Theta$ is a real symmetric negative definite matrix and
$\psi(\theta, \Theta)$ is a normalizing constant. We denote by $\mathcal{M}$ the
set of q-Gaussian densities, i.e.,

$$\mathcal{M} := \left\{ f(x; \theta, \Theta) \,\middle|\, \theta \in \mathbf{R}^n,\ 0 > \Theta = \Theta^T \in \mathbf{R}^{n \times n} \right\}. \tag{44}$$

For this setting, the corresponding generalized entropy
and divergence are

$$\mathcal{I}[p] = \frac{1}{2-q} \int \frac{p(x)^{2-q} - p(x)}{q - 1}\,dx, \tag{45}$$

$$\mathcal{D}[p\|r] = \int \left\{ \frac{r(x)^{2-q} - p(x)^{2-q}}{2-q} - p(x)\,\frac{r(x)^{1-q} - p(x)^{1-q}}{1-q} \right\} dx. \tag{46}$$
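The q-divergence (46) is the specialization of the Bregman divergence of the previous section to $\phi(u) = u^q$. The following sketch compares the two integrands pointwise; the value $q = 0.5$ and the function names are illustrative assumptions, and the comparison is done for plain numbers rather than for densities.

```python
Q = 0.5  # q in phi(u) = u**q, equivalently m = 2 - Q = 1.5 (illustrative)

def ln_q(t):
    """q-logarithm."""
    return (t ** (1.0 - Q) - 1.0) / (1.0 - Q)

def exp_q(y):
    """q-exponential, set to zero where the bracket is not positive."""
    base = 1.0 + (1.0 - Q) * y
    return base ** (1.0 / (1.0 - Q)) if base > 0.0 else 0.0

def F(s):
    """Integral of ln_q from 1 to s."""
    return (s ** (2.0 - Q) - 1.0) / ((2.0 - Q) * (1.0 - Q)) - (s - 1.0) / (1.0 - Q)

def bregman_integrand(p, r):
    """F(p) - F(r) - F'(r) (p - r), with F' = ln_q."""
    return F(p) - F(r) - ln_q(r) * (p - r)

def q_integrand(p, r):
    """Integrand of the q-divergence (46)."""
    return ((r ** (2.0 - Q) - p ** (2.0 - Q)) / (2.0 - Q)
            - p * (r ** (1.0 - Q) - p ** (1.0 - Q)) / (1.0 - Q))
```

Both integrands vanish at `p == r` and are nonnegative elsewhere, which is the pointwise reason the divergence is nonnegative.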

In the sequel we fix the relation between the exponent
of the PME and the parameter of the q-exponential function
by $m = 2 - q$. Hence, we consider the case $1 < m < 2$, or
equivalently, $0 < q < 1$. Since we fix $\phi(u) = u^q$, we omit
the subscripts $\phi$ used to denote several quantities. By a
suitable linear scaling of $t$ we can consider the problem
by fixing $\beta$ to an arbitrary constant. Hence, we fix $\beta$ and
introduce another constant $\mu$ for notational simplicity as
follows:

$$\beta := \frac{1}{n(m-1) + 2}, \qquad \mu := n\beta.$$

For the q-Gaussian family $\mathcal{M}$, we can regard $(\theta, \Theta)$
as the canonical coordinates, and the first moment vector
and second moment matrix $(\eta, H)$ defined by

$$\eta := \int x\, f(x; \theta, \Theta)\,dx, \qquad H := \int x x^T f(x; \theta, \Theta)\,dx,$$

as the expectation coordinates, respectively.

We assume that $u(x,0)$ and $\rho(x,0)$ are nonnegative and
integrable functions with finite second moments. When we
consider the set of solutions, we restrict their initial masses
to be normalized to one without loss of generality.

It is proved that there exists a unique nonnegative
weak solution if $m > 0$ [37, Theorem 5.1], and that the
mass $\int u(x,t)\,dx$ is invariant for all $t > 0$ if $m > (n-2)/n$

m- 

First of all, we review how the solutions of the PME
and NFPE are related in the proposition below. Because of
this fact, the properties of the solution of the PME (41)
are important to investigate those of the NFPE (42) and
vice versa.

Proposition 6 Let $u(x,t)$ be a solution of the PME (41)
with initial data $u(x,0) = u_0(x) \in L^1(\mathbf{R}^n)$. Define

$$\rho(z,\tau) := (t+1)^{n\beta}\,u(x,t), \qquad z := (t+1)^{-\beta} R x, \qquad \tau := \ln(t+1),$$

then $\rho(z,\tau)$ is a solution of (42) with $\nabla = \nabla_z$, $D = RR^T$
and initial data $\rho(z,0) = u_0(R^{-1}z)$.

Next, we find that the equilibrium density for the NFPE
is on the q-Gaussian family $\mathcal{M}$ via a Lyapunov approach. To
analyze the behavior of (42), let us define the generalized free
energy:

$$\mathcal{F}[\rho] := \int \frac{\beta}{2m}\,x^T D^{-1} x\,\rho(x)\,dx - \mathcal{I}[\rho].$$

This type of functional was first introduced in [39,40]. We
have

$$\frac{d\mathcal{F}[\rho(x,\tau)]}{d\tau} = -\frac{1}{m} \int \rho\,\left\| \beta R^{-1} x + m \rho^{m-2} R^T \nabla \rho \right\|^2 dx \le 0. \tag{47}$$

Thus, the equilibrium density $\rho_\infty(x)$ is determined from
(47) as a q-Gaussian:

$$\rho_\infty(x) = f(x; 0, \Theta_\infty) = \exp_q\!\left(x^T \Theta_\infty x - \psi(0, \Theta_\infty)\right), \tag{48}$$

where the canonical parameters are given by

$$\theta_\infty = 0, \qquad \Theta_\infty = -\frac{\beta}{2m}\,D^{-1}.$$
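The claim that the q-Gaussian (48) is an equilibrium of the NFPE (42) can be checked numerically in one dimension: the flux $\beta x \rho + D\,\partial_x \rho^m$ should vanish on the support. The sketch below is only illustrative; the parameter values (`m`, `beta`, `D`, `psi`), the central-difference step, and the function names are assumptions, and the density is left unnormalized because the flux vanishes for any value of the normalizing constant.

```python
def equilibrium(x, m=1.5, beta=0.4, D=2.0, psi=-1.0):
    """One-dimensional q-Gaussian (48) with q = 2 - m and Theta = -beta / (2 m D).

    psi is a hypothetical (unnormalized) constant chosen for illustration.
    """
    theta_inf = -beta / (2.0 * m * D)
    base = 1.0 + (m - 1.0) * (theta_inf * x * x - psi)  # 1 - q = m - 1
    return base ** (1.0 / (m - 1.0)) if base > 0.0 else 0.0

def flux(x, m=1.5, beta=0.4, D=2.0, psi=-1.0, h=1e-5):
    """Flux beta*x*rho + D*d(rho^m)/dx of (42); derivative by central difference."""
    def rho(y):
        return equilibrium(y, m, beta, D, psi)
    d_rho_m = (rho(x + h) ** m - rho(x - h) ** m) / (2.0 * h)
    return beta * x * rho(x) + D * d_rho_m
```

Evaluating `flux` at a few points inside the support should return values that are zero up to discretization error.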

Note that we can express the difference of the free energies
of $\rho(x)$ and the equilibrium $\rho_\infty(x) \in \mathcal{M}$ by the
divergence:

$$\mathcal{D}[\rho\|\rho_\infty] = \Psi(0, \Theta_\infty) - \mathcal{I}[\rho] - \Theta_\infty \cdot \mathrm{E}_\rho[x x^T] = \mathcal{F}[\rho] - \mathcal{F}[\rho_\infty].$$

Thus, the minimization of $\mathcal{F}[\cdot]$ is equivalent to that of
$\mathcal{D}[\cdot\|\rho_\infty]$.

Finally, we show one of the fundamental properties
of the PME and NFPE, which is needed to state the
subsequent results in this paper.

Proposition 7 The q-Gaussian family $\mathcal{M}$ is an invariant
manifold for both the PME and the NFPE.



9.2 Trajectories of m-projections 



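Let $\eta^{\mathrm{PM}} = (\eta_i^{\mathrm{PM}})$ and $H^{\mathrm{PM}} = (\eta_{ij}^{\mathrm{PM}})$ be, respectively, the
first moment vector and the second moment matrix, i.e.,

$$\eta_i^{\mathrm{PM}}(t) := \mathrm{E}_u[x_i] = \int x_i\,u(x,t)\,dx, \qquad \eta_{ij}^{\mathrm{PM}}(t) := \mathrm{E}_u[x_i x_j].$$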



Theorem 1 Consider solutions of the PME with the common
initial first and second moments. Then their m-projections
to $\mathcal{M}$ evolve monotonically along the common
m-geodesic curve that starts from the density determined by
the initial moments.

Outline of the proof) Differentiating $\eta_{ij}^{\mathrm{PM}}$ by $t$, we see that
the second moments evolve as

$$\eta_{ij}^{\mathrm{PM}}(t) = \eta_{ij}^{\mathrm{PM}}(0) + \delta_{ij}\,\sigma^{\mathrm{PM}}(t),$$
$$\sigma^{\mathrm{PM}}(t) := 2 \int_0^t dt' \int u(x,t')^m\,dx.$$

Note that $\sigma^{\mathrm{PM}}(t)$ is positive and monotone increasing on
$t > 0$. By a similar argument we see that $\dot\eta_i^{\mathrm{PM}} = 0$, i.e., the
first moment vector is invariant. From Proposition 4 and
i) of Proposition 5, the statement follows. □
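The velocity of the second moment used in the proof, namely $d\eta_{11}^{\mathrm{PM}}/dt = 2\int u^m dx$ in one dimension, can be checked numerically on the self-similar (Barenblatt-Pattle) solution. The sketch below is a hedged illustration: the value $m = 1.5$, the constant `c` (which fixes an arbitrary, not normalized, mass), the integration window, and all function names are assumptions, and the quadrature is a plain midpoint rule.

```python
M = 1.5                           # PME exponent, 1 < m < 2 (illustrative)
BETA = 1.0 / (M + 1.0)            # 1 / (n(m-1) + 2) with n = 1
K = BETA * (M - 1.0) / (2.0 * M)  # profile constant of the self-similar solution

def barenblatt(x, t, c=1.0):
    """Self-similar solution of u_t = (u^m)_xx; c fixes the (unnormalized) mass."""
    base = c - K * (x / t ** BETA) ** 2
    return t ** (-BETA) * base ** (1.0 / (M - 1.0)) if base > 0.0 else 0.0

def integrate(f, a=-6.0, b=6.0, n=24000):
    """Midpoint rule on [a, b]; the window is assumed to contain the support."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def second_moment(t):
    return integrate(lambda x: x * x * barenblatt(x, t))

def velocity_from_moments(t, dt=1e-3):
    """Central difference of the second moment in time."""
    return (second_moment(t + dt) - second_moment(t - dt)) / (2.0 * dt)

def velocity_from_power_integral(t):
    """Right-hand side 2 * integral of u^m."""
    return 2.0 * integrate(lambda x: barenblatt(x, t) ** M)
```

For times at which the support stays inside the integration window (for instance `t = 1.0`), the two velocity functions should agree up to discretization error.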

Remark 2 i) From the argument for the NFPE, we will
see that $\sigma^{\mathrm{PM}}(t) = O(t^{2\beta})$ as $t \to \infty$.

ii) The theorem implies that the trajectories of m-projections
on $\mathcal{M}$ for all the PME solutions $u(x,t)$ are parallelized in
the expectation coordinates, i.e.,

$$\eta^{\mathrm{PM}}(t) = \eta^{\mathrm{PM}}(0), \tag{49}$$
$$H^{\mathrm{PM}}(t) = H^{\mathrm{PM}}(0) + \sigma^{\mathrm{PM}}(t)\,I, \tag{50}$$

where $I$ denotes the identity matrix. See [16] for the
argument on the constants of motion.

Fig. 2. A solution $u(x,t)$ of the PME, its m-projection $\hat u(x,t)$
and the Barenblatt-Pattle solution $u^{\mathrm{BP}}(x,t)$ on

Let $\hat f_0(x) \in \mathcal{M}$ be the m-projection of the density
$f_0(x)$. Consider two solutions $u_1(x,t)$ and $u_2(x,t)$ of the
PME satisfying $u_1(x,t_0) = f_0(x)$ and $u_2(x,t_0) = \hat f_0(x)$
for a certain $t_0$. From the moment conservation property
of the m-projection stated in Proposition 5, the second
moment matrices $H_i^{\mathrm{PM}}(t)$ of $u_i(x,t)$ for $i = 1, 2$ satisfy
$H_1^{\mathrm{PM}}(t_0) = H_2^{\mathrm{PM}}(t_0)$. However, their velocities at $t_0$ have
the relation:

$$\dot H_1^{\mathrm{PM}}(t_0) - \dot H_2^{\mathrm{PM}}(t_0) = 2 \int \left( f_0(x)^m - \hat f_0(x)^m \right) dx\; I = 2m(m-1)\left( \mathcal{I}[\hat f_0] - \mathcal{I}[f_0] \right) I$$

from (50) and the expression of the generalized entropy
(45). Using the relation in Remark 1, we have the following:

Corollary 1 Let $\hat f_0(x) \in \mathcal{M}$ be the m-projection of a
density $f_0(x)$ and assume that two solutions $u_1(x,t)$ and
$u_2(x,t)$ of the PME satisfy the conditions $u_1(x,t_0) = f_0(x)$
and $u_2(x,t_0) = \hat f_0(x)$ at $t = t_0$. Then the velocities of their
respective second moment matrices at $t_0$ are related by

$$\dot H_1^{\mathrm{PM}}(t_0) - \dot H_2^{\mathrm{PM}}(t_0) = 2m(m-1)\,\mathcal{D}[f_0\|\hat f_0]\,I.$$

Thus, the m-projection $\hat u_1(x,t)$ of $u_1(x,t) \notin \mathcal{M}$, which
has the common second moment matrix $H_1^{\mathrm{PM}}(t)$ for all
$t$, evolves faster than $u_2(x,t) \in \mathcal{M}$, while $\hat u_1(x,t)$ and
$u_2(x,t)$ have the common trajectory on $\mathcal{M}$ by Theorem 1
(see Figure 3). The corollary suggests that by measuring
the diagonal elements of the second moment matrix we can
estimate how far $u_1(x,t)$ is from $\mathcal{M}$ in terms of the divergence.
Note that the difference of velocities vanishes when $m \to 1$. Hence,
this is a specific property of the slow diffusions governed
by the PME.

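Let $\eta^{\mathrm{FP}}(\tau)$ and $H^{\mathrm{FP}}(\tau)$ be, respectively, the first and
the second moments of $\rho(x,\tau)$, i.e.,

$$\eta^{\mathrm{FP}} = \mathrm{E}_\rho[x], \qquad H^{\mathrm{FP}} = \mathrm{E}_\rho[x x^T].$$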

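From the behavior of the moments of the PME and the
above relations of moments, we have

$$\eta^{\mathrm{FP}}(\tau) = e^{-\beta\tau}\,\eta^{\mathrm{FP}}(0),$$
$$H^{\mathrm{FP}}(\tau) = e^{-2\beta\tau} H^{\mathrm{FP}}(0) + e^{-2\beta\tau}\,\sigma^{\mathrm{FP}}(\tau)\,D,$$

where the scaling $\tau = \ln(t+1)$ is assumed and $\sigma^{\mathrm{FP}}(\tau)$ is
defined by

$$\sigma^{\mathrm{FP}}(\tau) := 2 \int_0^{\ln(1+t)} d\tau'\, e^{\tau' + n\beta\tau'(1-m)} \int \rho(x,\tau')^m\,dx = \det(R)\,\sigma^{\mathrm{PM}}(t)$$

for a solution $u$ of the PME and the corresponding solution
$\rho$ of the NFPE. Note that differentiating the above by $t$,
we have the relation:

$$(t+1)^{n\beta(1-m)} \int \rho(x,\tau)^m\,dx = \det(R) \int u(x,t)^m\,dx. \tag{51}$$

Fig. 3. Evolutions of the second moments of $u_1(x,t) \notin \mathcal{M}$
and $u_2(x,t) \in \mathcal{M}$ with the same initial moments for the
one-dimensional PME ($m = 1.9$) (Above) and the corresponding
velocities (Below). Both panels are plotted against time over
the range 0.000 to 0.010.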

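For the limiting case $m \to 1$ (and accordingly $\beta \to
1/2$), we see that the above expressions recover the well-known
linear Fokker-Planck case with a drift vector $x/2$:

$$\eta^{\mathrm{FP}}(\tau) = e^{-\tau/2}\,\eta^{\mathrm{FP}}(0),$$
$$H^{\mathrm{FP}}(\tau) = e^{-\tau} H^{\mathrm{FP}}(0) + 2(1 - e^{-\tau})\,D.$$

Since we know that $\rho(x,\tau)$ converges to $\rho_\infty(x) \in \mathcal{M}$ in
(48) and it holds that

$$\lim_{\tau \to \infty} H^{\mathrm{FP}}(\tau) = \sqrt{\det D}\,\left( \lim_{t \to \infty} (t+1)^{-2\beta}\,\sigma^{\mathrm{PM}}(t) \right) D \tag{52}$$

because $\det R = \sqrt{\det D}$, we conclude that the left-hand
side of (52) exists and $\sigma^{\mathrm{PM}}(t) = O(t^{2\beta})$ as $t \to \infty$ (Cf.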
Remark [2]) . Summing up the above with Proposition [H 
we obtain the following geometric property of the NFPE: 

Corollary 2 Consider solutions of the NFPE with the
common initial first and second moments. Then their
m-projections to $\mathcal{M}$ evolve along the common m-geodesic
curve approaching from the density determined by the
initial moments to the equilibrium $\rho_\infty(x)$.

9.3 Convergence rate of the solution of the PME to $\mathcal{M}$

The following proposition is obtained from ii) of Proposition 5
and a result [45,48] claiming that a solution of
the NFPE decays exponentially with respect to the divergence,
i.e.,

$$\mathcal{D}[\rho(x,\tau)\|\rho_\infty(x)] \le \mathcal{D}[\rho(x,0)\|\rho_\infty(x)]\,e^{-2\beta\tau}. \tag{53}$$

Proposition 8 Let $u(x,t)$ be a solution of the PME and
$\hat u(x,t)$ be the m-projection of $u(x,t)$ to the q-Gaussian
family $\mathcal{M}$ at each $t$. Then $u(x,t)$ asymptotically approaches
$\mathcal{M}$ with

$$\mathcal{D}[u(x,t)\|\hat u(x,t)] \le \frac{C_0}{1+t},$$

where $C_0$ is a constant depending on the initial function
$u(x,0)$.

Remark 3 Combining this result and the Csiszár-Kullback
inequality [45], we can also conclude the $L^1$ convergence
of $u(x,t)$ to $\mathcal{M}$ with the rate $1/\sqrt{t+1}$. This implies that
the convergence to $\mathcal{M}$ is faster than $1/t^{\beta}$, which is the
convergence rate to the self-similar solution of the PME
in the case $1 < m < 2$ [47,48].
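The algebraic decay in $t$ can be traced back to the exponential decay (53) by bookkeeping of exponents. The following is only a sketch under stated assumptions, not the proof in [16]: it assumes the rescaling $\rho(z,\tau) = (t+1)^{n\beta}u(x,t)$, $z = (t+1)^{-\beta}Rx$, $\tau = \ln(t+1)$ with $\beta = 1/(n(m-1)+2)$, a decay factor $e^{-2\beta\tau}$ in (53), and that the m-projection $\hat u$ is mapped to the m-projection $\hat\rho$ of $\rho$ by the same rescaling.

```latex
\begin{align*}
\mathcal{D}[u(\cdot,t)\,\|\,\hat u(\cdot,t)]
  &= \frac{(t+1)^{-n\beta(m-1)}}{\det R}\,
     \mathcal{D}[\rho(\cdot,\tau)\,\|\,\hat\rho(\cdot,\tau)]
  && \text{(the integrand of (46) is homogeneous of degree } m,\ dx = (t+1)^{n\beta}\,dz/\det R) \\
  &\le \frac{(t+1)^{-n\beta(m-1)}}{\det R}\,
     \mathcal{D}[\rho(\cdot,\tau)\,\|\,\rho_\infty]
  && \text{(triangular equality, since } \rho_\infty \in \mathcal{M}) \\
  &\le \frac{\mathcal{D}[\rho(\cdot,0)\,\|\,\rho_\infty]}{\det R}\,
     (t+1)^{-n\beta(m-1) - 2\beta}
  && \text{(decay (53) with } e^{-2\beta\tau} = (t+1)^{-2\beta}) \\
  &= \frac{C_0}{1+t}
  && \text{(because } \beta\,(n(m-1)+2) = 1)
\end{align*}
```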

10 Conclusions 

We have discussed the Legendre duality of generalized
entropies and its applications, focusing on the duality relation
(1) of the statistical manifold. Within this framework, we
classified extensions into two major directions using the
representation functions. In terms of the corresponding
geometric structure, we can say that one is characterized
by nonflatness with constant curvature, while the other is
characterized by dual flatness.

The important point would be how such a geometric
setup is useful for our understanding of various physical
phenomena not obeying the standard statistical theory.
For this purpose we have partially presented the recent
results on the solutions of the PME and the NFPE. However,
the whole picture of the proposed generalization in
Section 5 is still largely formal and it needs further
development on the basis of physical background.

Appendix

A Statistical manifolds and affine differential
geometry

In this appendix we briefly summarize several concepts
and properties of statistical manifolds and affine
hypersurfaces, which are necessary in this paper. For details
see [7,8,9], respectively.

A.1 Statistical manifold

For a torsion-free affine connection $\nabla$ and a pseudo-Riemannian
metric $g$ on a manifold $\mathcal{M}$, the triple $(\mathcal{M}, g, \nabla)$
is called a statistical manifold if it admits another torsion-free
connection $\nabla^*$ satisfying

$$X g(Y, Z) = g(\nabla_X Y, Z) + g(Y, \nabla^*_X Z) \tag{54}$$

for arbitrary $X$, $Y$ and $Z$ in $\mathcal{X}(\mathcal{M})$, where $\mathcal{X}(\mathcal{M})$ is the set
of all tangent vector fields on $\mathcal{M}$. We call $\nabla$ and $\nabla^*$ duals
of each other with respect to $g$, and $(\mathcal{M}, g, \nabla^*)$ is said to be the dual
statistical manifold of $(\mathcal{M}, g, \nabla)$. The triple of a Riemannian
metric and a pair of dual connections $(g, \nabla, \nabla^*)$ satisfying
(54) is called a dualistic structure on $\mathcal{M}$, which plays
a fundamental role in the study of manifolds of probability
distributions.

A statistical manifold $(\mathcal{M}, g, \nabla)$ is said to have constant
curvature $\kappa$ if the curvature tensor $R$ of $\nabla$ satisfies

$$R(X, Y)Z = \kappa\,\{ g(Y, Z)X - g(X, Z)Y \}. \tag{55}$$

When the constant $\kappa$ is zero, the statistical manifold is
said to be flat, or dually flat, because the curvature tensor
$R^*$ of $\nabla^*$ is known to vanish automatically.

A.2 Affine hypersurface theory

Let $\mathcal{M}$ be an n-dimensional manifold and consider an
affine immersion $(f, \xi)$, which is the pair of an immersion
$f$ of $\mathcal{M}$ into $\mathbf{R}^{n+1}$ and a transversal vector field $\xi$
to $f(\mathcal{M})$. We denote by $D_X f_*(Y)$ the covariant derivative
along $f$ induced by the standard flat connection $D$
of $\mathbf{R}^{n+1}$. By a given affine immersion $(f, \xi)$ of $\mathcal{M}$, the
Gauss and Weingarten formulas are respectively obtained
as follows:

$$D_X f_*(Y) = f_*(\nabla_X Y) + h(X, Y)\,\xi, \tag{56}$$
$$D_X \xi = -f_*(SX) + \tau(X)\,\xi. \tag{57}$$

Here, $\nabla$, $h$, $S$ and $\tau$ determined from the above formulas
are called, respectively, the induced connection, affine
fundamental form, affine shape operator and transversal
connection form [9]. By regarding $h$ as a (pseudo-) Riemannian
metric of $\mathcal{M}$, we say that the affine immersion
realizes $(\mathcal{M}, h, \nabla)$ in $\mathbf{R}^{n+1}$.

An affine immersion is said to be nondegenerate and equiaffine
if $h$ is nondegenerate and $\tau = 0$, respectively. Further, let $o$
be the origin of $\mathbf{R}^{n+1}$. Then we say that the affine immersion
is centro-affine with a scaling constant $p$ if $\xi$ at $f(x)$
is equal to $-p$ times the vector $\overrightarrow{o f(x)}$ for a constant $p$
and $x \in \mathcal{M}$.

The following facts hold [9], which are convenient for
understanding the properties of the manifold realized by an
affine immersion:

F1) An equiaffine and nondegenerate affine immersion
realizes a statistical manifold,

F2) A centro-affine immersion with a scaling constant $p$
is equiaffine with $S = pI$. The realized manifold is
projectively flat,

F3) An equiaffine and nondegenerate affine immersion with
$S = pI$ realizes a statistical manifold with constant
curvature $p$.

Let $(\mathcal{M}, g, \nabla)$ be a statistical manifold with constant
curvature, which is realized by a centro-affine immersion
with constant scaling. It is known that $(\mathcal{M}, g, \nabla)$ has the
following properties [8,9]:

P1) A modified Pythagorean relation for the geometric
divergence (contrast function) holds on $\mathcal{M}$,

P2) Let $\gamma$ be a $\nabla$-geodesic curve joining two points $x$ and
$y$ on $\mathcal{M}$. Then, $f(\gamma)$ is the intersection of $f(\mathcal{M})$ and the
two-dimensional plane containing $\overrightarrow{o f(x)}$ and $\overrightarrow{o f(y)}$ in
$\mathbf{R}^{n+1}$.

The corresponding results also hold for the dual statistical
manifold $(\mathcal{M}, g, \nabla^*)$ with constant curvature.

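B Proof of Proposition 3

First, recall that $\mathcal{D}^{(\alpha)}$ is the restriction of $\mathcal{D}$ in (22) defined
on $\mathbf{R}^{n+1} \times \mathbf{R}^{n+1}$. Hence, in the ambient space $(\mathbf{R}^{n+1}, D)$,
the gradient vector $\tilde X$ of $\mathcal{D}(\theta, \theta(r))$ at $\theta \in \mathbf{R}^{n+1}$ is
represented by

$$\tilde X = \sum_{i=1}^{n+1} \tilde X^i \frac{\partial}{\partial \theta^i},$$

where

$$\tilde X^i = \sum_{j=1}^{n+1} g^{ij} (\eta_j(r) - \eta_j), \qquad i = 1, \dots, n+1,$$

and $g^{ij}$ is the component of the inverse of the Riemannian
metric $g = (\partial^2 \psi / \partial\theta^i \partial\theta^j)$ given in ([SS]). Hence, the
gradient vector $\tilde X$ is represented in the $\eta$-coordinates by

$$\tilde X = \sum_{i=1}^{n+1} (\eta_i(r) - \eta_i) \frac{\partial}{\partial \eta_i},$$

and its integral curve is

$$\eta_i(t) = e^{-t} (\eta_i(p_0) - \eta_i(r)) + \eta_i(r).$$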
Next, consider the vector field defined by 

n+1 „ n+1 „ n + 1 „ 

then $N$ is orthogonal to $T_p S^n$ at each $p \in S^n$ with respect
to $g$. It is verified that the gradient vector $X$ is represented
as the orthogonal projection of $\tilde X$ onto $T_p S^n$ along $N$.

Combining the above two facts, we see that the integral
curve (gradient flow) of $X$ is restricted to the
two-dimensional plane $\mathcal{P}^*$ containing the three points $\eta(p_0)$,
$\eta(r)$ and the origin $o$ in $(\mathbf{R}^{n+1})^*$ with the $\eta$-coordinate
system. Thus, the second statement follows. Note that $S^n$
is represented as the level surface $\varphi(\eta) = 2/(1 - \alpha)$ in
$(\mathbf{R}^{n+1})^*$; then we conclude that the gradient flow is
actually represented by the intersection of the level surface
and $\mathcal{P}^*$. This proves the first statement owing to the
property (P2) in Appendix A for the dual statistical manifold
$(S^n, g, \nabla^{(-\alpha)})$. □



